Module 2: Working with data
PROFESP, DEMSP, MS
join() functions
The stringr package
More on the flextable package: title, header, and source
# A tibble: 5 × 2
  date_onset hospital
  <date>     <chr>
1 2014-05-18 Port Hospital
2 2014-05-21 Military Hospital
3 2014-05-22 Port Hospital
4 2014-06-06 Port Hospital
5 2014-06-13 Military Hospital
# A tibble: 5 × 3
  Hospital          N_residentes level
  <fct>                    <dbl> <chr>
1 central hospital       1950280 Tertiary
2 military hospital        40500 Secondary
3 military hospital        10000 Primary
4 port hospital            50280 Secondary
5 central hospital         12000 Secondary
Classic (non-probabilistic) joins
You need at least one "key" column to join on.
You need to ensure:
the same column name (preferable)
the same column class
key values that match exactly
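The three requirements above can be checked before joining. The following is a minimal sketch, assuming two small data frames named `linelist` and `hospitals` (hypothetical names) that reproduce a subset of the rows in the two tables shown earlier:

```r
library(dplyr)

# Hypothetical object names; the values are a subset of the tables above.
linelist <- tibble(
  date_onset = as.Date(c("2014-05-18", "2014-05-21")),
  hospital   = c("Port Hospital", "Military Hospital")
)
hospitals <- tibble(
  Hospital     = factor(c("central hospital", "military hospital", "port hospital")),
  N_residentes = c(1950280, 40500, 50280)
)

# 1. Column names shared by both tables (empty here: "hospital" vs "Hospital")
shared_names <- intersect(names(linelist), names(hospitals))

# 2. Class of each candidate key column
class_linelist  <- class(linelist$hospital)
class_hospitals <- class(hospitals$Hospital)

# 3. Key values in linelist with no exact match in hospitals
unmatched <- setdiff(linelist$hospital, as.character(hospitals$Hospital))

shared_names
c(class_linelist, class_hospitals)
unmatched
```

In this sketch all three checks flag a problem: the names differ in case, the classes differ, and no key value matches exactly.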
dplyr package
rename()
Can be chained with the pipe %>%
Base R
names()
Names must be in quotes, and every column name must be listed
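Both options can be sketched as follows, assuming a data frame named `linelist` (a hypothetical name) holding two rows of the first table:

```r
library(dplyr)

# Hypothetical object name; the rows mirror the first table.
linelist <- tibble(
  date_onset = as.Date(c("2014-05-18", "2014-05-21")),
  hospital   = c("Port Hospital", "Military Hospital")
)

# dplyr: rename(new_name = old_name), no quotes needed, works inside a pipe
renamed_dplyr <- linelist %>%
  rename(Inicio_sint = date_onset, Hospital = hospital)

# Base R: assign a quoted character vector that covers every column, in order
renamed_base <- linelist
names(renamed_base) <- c("Inicio_sint", "Hospital")

names(renamed_dplyr)
names(renamed_base)
```

With rename() only the columns being changed need to be mentioned, which is why it tends to be the safer choice on wide tables.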
# A tibble: 5 × 2
  Inicio_sint Hospital
  <date>      <chr>
1 2014-05-18  Port Hospital
2 2014-05-21  Military Hospital
3 2014-05-22  Port Hospital
4 2014-06-06  Port Hospital
5 2014-06-13  Military Hospital
# A tibble: 5 × 3
  Hospital          N_residentes level
  <fct>                    <dbl> <chr>
1 central hospital       1950280 Tertiary
2 military hospital        40500 Secondary
3 military hospital        10000 Primary
4 port hospital            50280 Secondary
5 central hospital         12000 Secondary
The "key" columns must be of the same type.
In this case, one is a factor and the other is a character.
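One way to align the types is to convert the factor key to character. This is a minimal sketch, assuming a data frame named `hospitals` (a hypothetical name) with a subset of the rows of the second table:

```r
library(dplyr)

# Hypothetical object name; the key column starts out as a factor.
hospitals <- tibble(
  Hospital     = factor(c("central hospital", "military hospital", "port hospital")),
  N_residentes = c(1950280, 40500, 50280)
)

class_before <- class(hospitals$Hospital)   # "factor"

# Convert the key column to character so it matches the other table
hospitals <- hospitals %>%
  mutate(Hospital = as.character(Hospital))

class_after <- class(hospitals$Hospital)    # "character"
```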
# A tibble: 5 × 2
  Inicio_sint Hospital
  <date>      <chr>
1 2014-05-18  Port Hospital
2 2014-05-21  Military Hospital
3 2014-05-22  Port Hospital
4 2014-06-06  Port Hospital
5 2014-06-13  Military Hospital
# A tibble: 5 × 3
  Hospital          N_residentes level
  <chr>                    <dbl> <chr>
1 central hospital       1950280 Tertiary
2 military hospital        40500 Secondary
3 military hospital        10000 Primary
4 port hospital            50280 Secondary
5 central hospital         12000 Secondary
The names are not written the same way
Useful functions
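The slide does not name the functions, so the following is only a sketch under that assumption: `str_to_lower()` and `str_trim()` from stringr are one plausible way to standardise the key values, applied here to a hypothetical `linelist` with two rows of the first table:

```r
library(dplyr)
library(stringr)

# Hypothetical object name; values still use title case.
linelist <- tibble(
  Inicio_sint = as.Date(c("2014-05-18", "2014-05-21")),
  Hospital    = c("Port Hospital", "Military Hospital")
)

# Assumed choice of functions: trim stray whitespace, then lower-case the key
linelist <- linelist %>%
  mutate(Hospital = str_to_lower(str_trim(Hospital)))

linelist$Hospital
```

Base R's tolower() and trimws() would do the same job without stringr.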
# A tibble: 5 × 2
  Inicio_sint Hospital
  <date>      <chr>
1 2014-05-18  port hospital
2 2014-05-21  military hospital
3 2014-05-22  port hospital
4 2014-06-06  port hospital
5 2014-06-13  military hospital
# A tibble: 5 × 3
  Hospital          N_residentes level
  <chr>                    <dbl> <chr>
1 central hospital       1950280 Tertiary
2 military hospital        40500 Secondary
3 military hospital        10000 Primary
4 port hospital            50280 Secondary
5 central hospital         12000 Secondary
We can proceed with the join!
# A tibble: 4 × 4
  Inicio_sint Hospital          N_residentes level
  <date>      <chr>                    <dbl> <chr>
1 2014-05-18  port hospital            50280 Secondary
2 2014-05-21  military hospital        40500 Secondary
3 2014-05-21  military hospital        10000 Primary
4 2014-05-22  port hospital            50280 Secondary
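A join of this kind can be sketched with `left_join()`. The slide does not show the call that produced the output, so the function choice and the object names `linelist` and `hospitals` are assumptions. The sketch uses only the first two linelist rows together with the full hospital table:

```r
library(dplyr)

# Hypothetical object names; keys already cleaned as described earlier.
linelist <- tibble(
  Inicio_sint = as.Date(c("2014-05-18", "2014-05-21")),
  Hospital    = c("port hospital", "military hospital")
)
hospitals <- tibble(
  Hospital     = c("central hospital", "military hospital", "military hospital",
                   "port hospital", "central hospital"),
  N_residentes = c(1950280, 40500, 10000, 50280, 12000),
  level        = c("Tertiary", "Secondary", "Primary", "Secondary", "Secondary")
)

# Keep every linelist row and attach the matching hospital columns
joined <- left_join(linelist, hospitals, by = "Hospital")
joined
```

Because "military hospital" appears twice in the hospital table, the matching linelist row is returned twice, once per match. Duplicated keys in the lookup table are worth checking before joining.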
left_join()
right_join()
inner_join()
full_join()
semi_join()
anti_join()
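The six functions differ in which rows they keep. A minimal sketch comparing them, using hypothetical objects `linelist` and `hospitals`; the value "other hospital" is invented for illustration so that each table has one key with no match in the other:

```r
library(dplyr)

# Illustrative data: "other hospital" is a hypothetical, unmatched key.
linelist <- tibble(
  Inicio_sint = as.Date(c("2014-05-18", "2014-05-21", "2014-05-22")),
  Hospital    = c("port hospital", "military hospital", "other hospital")
)
hospitals <- tibble(
  Hospital     = c("central hospital", "military hospital", "port hospital"),
  N_residentes = c(1950280, 40500, 50280)
)

# Mutating joins: add columns from hospitals
n_left  <- nrow(left_join(linelist, hospitals, by = "Hospital"))   # all linelist rows
n_right <- nrow(right_join(linelist, hospitals, by = "Hospital"))  # all hospitals rows
n_inner <- nrow(inner_join(linelist, hospitals, by = "Hospital"))  # matched rows only
n_full  <- nrow(full_join(linelist, hospitals, by = "Hospital"))   # rows from both tables

# Filtering joins: keep or drop linelist rows, no columns added
n_semi <- nrow(semi_join(linelist, hospitals, by = "Hospital"))    # rows with a match
n_anti <- nrow(anti_join(linelist, hospitals, by = "Hospital"))    # rows without a match

c(left = n_left, right = n_right, inner = n_inner,
  full = n_full, semi = n_semi, anti = n_anti)
```

anti_join() is handy as a diagnostic before a mutating join, since it lists the key values that would end up with missing values.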